Value-function reinforcement learning in Markov games
نویسنده
چکیده
Markov games are a model of multiagent environments that are convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason about the behavior of simultaneous learners in a shared environment. 2001 Elsevier Science B.V. All rights reserved.
منابع مشابه
Value Function Approximation in Zero-Sum Markov Games
This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agent case. We generalize error bounds from MDPs to Markov games and describe generalizations of reinforcement learning algorithms to Markov games. We present a generalization of the optimal stopping probl...
متن کاملLearning to control forest fires with ESP
Reinforcement Learning (Kaelbling et al., 1996) can be used to learn to control an agent by letting it interact with its environment. In general there are two kinds of reinforcement learning; (1) Value-function based reinforcement learning, which are based on the use of heuristic dynamic programming algorithms such as temporal difference learning (Sutton, 1988) and Q-learning (Watkins, 1989), a...
متن کاملUtilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi agent case, have long been used for modeling multi agent system and are used as a suitable framework for Multi agent Reinforcement Learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDP is proposed. In the proposed algorithm, MMDP ...
متن کاملMultiagent Reinforcement Learning in Stochastic Games
We adopt stochastic games as a general framework for dynamic noncooperative systems. This framework provides a way of describing the dynamic interactions of agents in terms of individuals' Markov decision processes. By studying this framework, we go beyond the common practice in the study of learning in games, which primarily focus on repeated games or extensive-form games. For stochastic games...
متن کاملA Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms
Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can prov...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Cognitive Systems Research
دوره 2 شماره
صفحات -
تاریخ انتشار 2001